Identification of transcriptional networks controlling leaf sheath growth in Sorghum bicolor

Objectives: The objective of this data set was to identify transcriptional networks that control elongation of seedling leaf sheaths in the C4 grass Sorghum bicolor. One motivation was that leaf sheaths are a primary constituent of stems in grass seedlings; therefore, genes that control growth of this organ are important contributors to successful transition from the seedling stage to the mature plant stage and, ultimately, crop success. Since diurnal rhythms contribute to regulation of signaling networks responsible for growth, a time course representing the late afternoon and early evening was anticipated to pinpoint important control genes for stem growth. Ultimately, the expected outcome was discovery of transcript networks that integrate internal and external signals to fine-tune leaf sheath growth and, consequently, plant height.

Data description: The data set is RNAseq profiling of upper leaf sheaths collected from wild type Sorghum bicolor (BTx623 line) plants at 12.5, 16, and 20 h after dawn. Global transcript levels in leaf sheaths were determined by deep sequencing of mRNA from four individual seedlings at each time point. This data set contains sequences representing the spectrum of mRNAs from individual genes. It enables detection of significant changes in gene-level expression caused by the progression of the day from late afternoon to the middle of the night, and it is useful for identifying gene expression networks regulating growth in the leaf sheath, an organ that is a major contributor to the sorghum seedling stem and defines seedling height.

Samuel De Riseis 1,2 and Frank G. Harmon 1,2*

The leaf sheath wraps around emerging immature leaves and the internodes from which leaves originate. The stem in young seedlings is primarily composed of successive leaf sheaths and, therefore, these organs are critical for structural support and for establishing plant height. The amount of elongation occurring in the sheath depends on internal and external signals that are incompletely understood. The objective of this data set was to identify transcriptional networks that control elongation of the leaf sheath in Sorghum bicolor as a representative of the Poaceae family. Since diurnal rhythms are well known to contribute to regulation of growth-related signaling networks, a time course representing the late afternoon and early evening was expected to highlight likely control genes for stem growth. Ultimately, the expected outcome was discovery of transcript networks that integrate internal and external signals to fine-tune leaf sheath growth and, consequently, plant height. These data are useful for engineering plant stature and for altering the sensitivity of growth in cereal crops to the consequences of environmental change.

Data description
The data set is RNAseq of the top half of the leaf 5 sheath from individual wild type Sorghum bicolor (BTx623 inbred, carrying the male sterile 8 (ms8) mutation [1]) seedlings. Plants were grown in 4-inch peat pots filled with SuperSoil (The Scotts Company) soil under greenhouse conditions of 16-hour days and 8-hour nights. Natural sunlight was supplemented with LumiGrow Pro325 LEDs. Daytime temperature was set to 26°C and nighttime temperature was set to 20°C. After 14 days, the plants were transferred to Percival growth chambers set to 16 hours of white LED light, at 360 µmol photons m⁻² s⁻¹, and 8 hours of darkness, with daytime temperature set to 26°C and nighttime temperature set to 22°C. Samples were dissected from 21-day-old plants by removing leaves 1-4, cutting the leaf 5 sheath in the middle, and unwrapping the sheath tissue from the underlying leaves. Samples were immediately frozen in liquid nitrogen.
Plants were sampled at 12.5, 16, and 20 hours after dawn, with four biological replicates (individual seedlings) at each time point, for a total of 12 samples. Total RNA for each individual was isolated with the Qiagen Plant RNeasy Kit (www.qiagen.com) according to the manufacturer's recommendations. Residual genomic DNA was removed by on-column digestion with the Qiagen RNase-Free DNase Set (www.qiagen.com) according to the manufacturer's recommendations.

Sequencing library preparation and Illumina NovaSeq 6000 (www.illumina.com) next-generation sequencing were done by Novogene Corporation Inc. (www.novogene.com/us-en/). Non-directional libraries were prepared with the "NEBNext Ultra II RNA Library Prep Kit for Illumina" from New England Biolabs (www.neb.com) with messenger RNA purified from total RNA using poly-T oligo-attached magnetic beads. Library concentration was quantified with Qubit and real-time PCR. Fragment size distribution was determined by Bioanalyzer (Agilent Technologies, www.agilent.com). Pooled libraries were sequenced as paired-end 150 base pair reads on one lane of an Illumina NovaSeq 6000.

Raw reads were filtered to remove reads containing adapter sequences and low-quality reads, defined as 1) reads in which undetermined bases (N) make up more than 10% of the read or 2) reads in which low-quality bases (Q-score ≤ 5) make up more than 50% of the total bases. The 5' adapter sequence was 5'-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT-3' and the 3' adapter sequence was 5'-GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGATGACTATCTCGTATGCCGTCTTCTGCTTG-3'.

These pass-filter reads were demultiplexed according to the 12 biological samples into forward read and reverse read FASTQ files. The total number of pass-filter reads for each data set was as follows: 63,620,858 in data set 1, 44,181,504 in data set 2, 39,214,778 in data set 3, 61,216,552 in data set 4, 45,273,096 in data set 5, 54,439,560 in data set 6, 55,008,376 in data set 7, 62,795,986 in data set 8, 41,410,692 in data set 9, 47,362,084 in data set 10, 45,503,020 in data set 11, and 54,403,120 in data set 12. The FASTQ files were deposited at the National Center for Biotechnology Information (NCBI) as BioProject ID PRJNA1008758 (https://identifiers.org/ncbi/bioproject:PRJNA1008758) [2].

Limitations
• The leaf sheath is a specific leaf structure. This should be considered if these data are interpreted for other parts of the leaf, as well as for different plant tissues.
• The sample collection strategy taken here, which focused on gene expression in the afternoon to evening hours, may result in underrepresentation of transcripts primarily expressed at other times of the day.
• The RNAseq libraries were non-stranded. As a consequence, these libraries represent overall gene expression, but do not discriminate between sense and antisense transcripts.
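For readers who want to apply comparable quality criteria to their own reads, the two low-quality thresholds stated in the Data description (more than 10% undetermined bases, or more than 50% of bases with Q-score ≤ 5) can be sketched in Python as below. This is only an illustrative sketch, not the sequencing provider's actual pipeline: the function names are hypothetical, Phred+33 quality encoding is assumed, and adapter detection and paired-end handling are not covered.

```python
from typing import List


def phred33_to_scores(quality_line: str) -> List[int]:
    """Decode a FASTQ quality string into Q-scores (assumes Phred+33 encoding)."""
    return [ord(ch) - 33 for ch in quality_line]


def passes_filter(
    sequence: str,
    qualities: List[int],
    max_n_fraction: float = 0.10,            # criterion 1: N content above 10% fails
    low_quality_cutoff: int = 5,             # criterion 2: a base is low quality if Q <= 5
    max_low_quality_fraction: float = 0.50,  # criterion 2: more than 50% low-quality bases fails
) -> bool:
    """Return True if a single read passes both low-quality criteria."""
    if not sequence or len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must be non-empty and of equal length")

    # Fraction of bases that could not be determined (reported as N).
    n_fraction = sequence.upper().count("N") / len(sequence)

    # Fraction of bases at or below the low-quality cutoff.
    low_quality_fraction = sum(1 for q in qualities if q <= low_quality_cutoff) / len(qualities)

    return n_fraction <= max_n_fraction and low_quality_fraction <= max_low_quality_fraction


if __name__ == "__main__":
    # Hypothetical 150-base read with uniformly high quality ("I" decodes to Q40).
    read = "ACGTA" * 30
    quals = phred33_to_scores("I" * 150)
    print(passes_filter(read, quals))
```

The thresholds are exposed as keyword arguments so that the same check can be reused with different cutoffs. Both comparisons are strict, so a read with exactly 10% N or exactly 50% low-quality bases is retained, which follows the "over" wording of the stated criteria.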

Table 1
Overview of data files/data sets